# Toxicity Detection
- **Toxic Prompt Roberta** (Intel) · MIT. A RoBERTa-based text classification model for detecting toxic prompts and responses in dialogue systems. Tags: Text Classification, Transformers. 416 downloads, 7 likes.
- **Gptfuzz** (hubert233) · MIT. A RoBERTa fine-tuned classification model for evaluating the toxicity level of responses. Tags: Text Classification, Transformers. 2,578 downloads, 12 likes.
- **Toxicitymodel** (nicholasKluge) · Apache-2.0. ToxicityModel is a fine-tuned model based on RoBERTa, designed to assess the toxicity level of English sentences. Tags: Text Classification, Transformers, English. 133.56k downloads, 12 likes.
- **Reward Model Deberta V3 Large V2** (OpenAssistant) · MIT. This reward model is trained to predict which generated answer humans would prefer for a given question. Suitable for QA evaluation, RLHF reward scoring, and toxic answer detection. Tags: Large Language Model, Transformers, English. 11.15k downloads, 219 likes.
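All four models are served through the Transformers `text-classification` task, so they share one loading pattern. Below is a minimal sketch using the Hugging Face `pipeline` API; the model id `Intel/toxic-prompt-roberta` is an assumption matching the "Toxic Prompt Roberta" entry above (verify the exact id on the hub before use), and the label names and score threshold are illustrative, not taken from any specific model card.

```python
def is_toxic(scores, threshold=0.5):
    """Return True if any label other than 'non-toxic' scores above threshold.

    `scores` is a list of {"label": str, "score": float} dicts, the shape
    returned by the transformers text-classification pipeline.
    The "non-toxic" label name is an assumption; check the model's config.
    """
    return any(
        s["score"] >= threshold and s["label"].lower() != "non-toxic"
        for s in scores
    )


def classify(text, model_id="Intel/toxic-prompt-roberta"):
    """Score one string with a hub toxicity classifier (downloads the model)."""
    # Imported lazily so is_toxic() stays usable without transformers installed.
    from transformers import pipeline

    clf = pipeline("text-classification", model=model_id, top_k=None)
    # top_k=None returns scores for every label, not just the argmax.
    return clf(text)


if __name__ == "__main__":
    print(classify("You are a wonderful person."))
```

The same sketch applies to the other entries by swapping the model id; only the label vocabulary (and hence the check inside `is_toxic`) differs per model.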